SUMMARY

RPK Corporation has converted COSMIC/NASTRAN to the CRAY 
computer systems. The CRAY version is currently available and 
provides users with access to all of the machine-independent 
source code of COSMIC/NASTRAN. Future releases of COSMIC/NASTRAN 
will be made available on the CRAY by RPK soon after they are 
released by COSMIC. 


INTRODUCTION 

RPK Corporation has converted COSMIC/NASTRAN to the CRAY 
computers that operate under the CRAY operating system (COS). 
RPK believes that NASTRAN users with CRAY computers desire to 
have COSMIC/NASTRAN available to them. With RPK's CRAY version, 
users have access to all of the features in the current release 
of COSMIC/NASTRAN. These features include not only the analysis 
capabilities offered by NASTRAN, but also the availability of the 
machine-independent source code, thereby giving users the freedom 
and ability to incorporate in-house modifications and 
enhancements to NASTRAN. RPK is committed to making available 
and maintaining future releases of COSMIC/NASTRAN on the CRAY. 
RPK will ensure that the CRAY version always has all of the 
capabilities available in the latest COSMIC-maintained versions 
of NASTRAN. 


ADVANTAGES OF A CRAY COMPUTER 

The CRAY computer is established as one of the fastest 
computers in the world. The CRAY computer employs a pipeline 
architecture with scalar and vector processing capabilities 
(Reference 1). It is capable of a peak computational speed of at 
least 100 million floating point operations per second and has a 
central-memory bandwidth of one word per 12.5 nanoseconds, or 80 
million words per second. The CRAY computer is highly compact 
and, because of this, signals can be carried from point to point 
in it at the velocity attainable with ordinary copper wire: 
about three-tenths the speed of light. The CRAY also has a very 
fast scalar speed. This scalar speed is a dominant factor for 
programs that are not optimized for vectorization or that do not 
lend themselves to significant vector optimization. 

Figure 1 shows a generalized block diagram of the 
architecture of a vector computer similar to the CRAY computer. 
In this diagram, the instruction registers are read and processed 
by the pipelined instruction processor and the scalar registers 
are used by the pipelined scalar arithmetic and logic unit. The 
vector processor performs all vector processes. On the CRAY 
computer, there are five groups of registers: 8 address 
registers, 64 intermediate address registers, 8 scalar registers, 
64 intermediate scalar registers and 8 vector registers 
containing 64 words each. In addition, the CRAY has 4 sets of 
16-word buffers used for storing instructions. 

Figure 2 shows the vector processor of a CRAY computer. It 
includes seven special-purpose pipelined units for executing 
specific functions. Three are shared with the CRAY's scalar 
processor. Several of the units can work concurrently on 
different vector operations. Vector data stream from the eight 
vector registers, through the functional units and back to 
registers. The steering module switches operands from the 
registers to the functional units and back again to the 
registers. While some registers are serving as sources or 
destinations of vector operations, others can be transferring 
data to or from central memory. Because of the 
register-to-register streaming of vectors, pipelines are short 
and start-up overhead is small. 

One consequence of the register-to-register streaming of 
vectors is that the curve of efficiency (megaflops, or millions 
of floating point operations per second, versus vector length) 
shows peaks at vector lengths that are multiples of 64. This is 
shown in Figure 3. The peaks at vector lengths of 64 and 128 are 
there because there are 64 words in each vector register and the 
CRAY operates most efficiently when all of these words are used. 
The curve drops off after 64 because of the time it takes to 
reload the registers with the next data to be processed 
(Reference 2). 
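
The shape of this curve can be illustrated with a small sketch. 
The Python below is not taken from Reference 2 and does not use 
measured CRAY timings; it assumes a toy cost model in which each 
strip of up to 64 elements pays a fixed start-up cost plus one 
time unit per element. The names strip_mine and modeled_rate and 
the start-up value of 20 are hypothetical, chosen only to show 
why the rate peaks at multiples of 64. 

```python
VECTOR_REGISTER_LENGTH = 64  # words per vector register, as stated in the text


def strip_mine(n, strip=VECTOR_REGISTER_LENGTH):
    """Split a loop of n elements into (start, length) strips of at most `strip`."""
    return [(start, min(strip, n - start)) for start in range(0, n, strip)]


def modeled_rate(n, startup=20.0, per_element=1.0):
    """Results per time unit under a toy model (assumed, not measured).

    Each strip costs a fixed `startup` (register reload and pipeline fill)
    plus `per_element` for every element it processes.
    """
    total_time = sum(startup + per_element * length
                     for _, length in strip_mine(n))
    return n / total_time


if __name__ == "__main__":
    for n in (32, 63, 64, 65, 96, 128, 129):
        print(n, round(modeled_rate(n), 3))
```

Under this model a vector of length 130 is processed as three 
strips of 64, 64 and 2 elements, and the last strip pays the full 
start-up cost for only two results, which is why the modeled rate 
falls just past each multiple of 64. 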


DESIGN OF THE CRAY VERSION 

The design of the CRAY version of COSMIC/NASTRAN is similar 
to that of the DEC VAX version. There are fifteen programs that 
correspond to the fifteen standard NASTRAN links. None of these 
programs contains an overlay structure. The fifteen programs 
dynamically chain themselves through the use of conditional job 
control language (JCL) (Reference 3). The I/O is designed to 
automatically allow for logical file extensions to additional 
physical files if space is exhausted on any given external file. 
This will ensure that no jobs are lost due to space limitations 
on one file. 
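
The idea of extending a logical file across several physical 
files can be sketched as follows. This is only an illustration 
of the concept, not RPK's I/O implementation: the class name 
LogicalFile, the in-memory lists standing in for physical files, 
and the fixed capacity are all assumptions made for the example. 

```python
class LogicalFile:
    """One logical file backed by a growing list of fixed-capacity parts.

    Each inner list stands in for a physical file (an assumption for
    illustration; a real system would use external datasets).
    """

    def __init__(self, capacity_words):
        self.capacity = capacity_words
        self.physical = [[]]  # start with a single physical file

    def write(self, words):
        for word in words:
            # When the current physical file is full, extend the logical
            # file onto a new one instead of failing the job.
            if len(self.physical[-1]) >= self.capacity:
                self.physical.append([])
            self.physical[-1].append(word)

    def read_all(self):
        # Readers see one continuous stream regardless of the split.
        return [word for part in self.physical for word in part]


if __name__ == "__main__":
    f = LogicalFile(capacity_words=4)
    f.write(range(10))
    print(len(f.physical), f.read_all())
```

The caller writes and reads through a single object, so running 
out of space on one physical file does not surface as an error. 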

RPK has designed the CRAY version to allow for easy 
maintenance and growth. There is no need for a special linkage 
editor or for any special software utilities other than those 
provided by COS. Users can update and modify the CRAY version 
using such standard CRAY-supplied utilities as BUILD 
(Reference 3) and UPDATE (Reference 4). The design also readily 
lends itself to the use of the Fortran Flow Trace capability 
(Reference 5). This capability is of immense help in accurately 
evaluating the performance of the code and in determining the 
areas of the code where optimization will yield the greatest 
benefit. 


OPTIMIZATION OF THE CRAY VERSION 

Several important areas of code in RPK's CRAY version have 
been optimized by using the vectorization techniques available on 
the CRAY. These include the decomposition, forward/backward 
substitution and multiply/add routines, certain eigenvalue 
extraction routines and others. The reduction in CPU times 
resulting from optimization in these areas of code has ranged 
from a minimum of about 50% to as high as 99%. RPK is committed 
to optimizing the entire spectrum of capabilities in 
COSMIC/NASTRAN. RPK regards this work as a continuing activity 
and expects to optimize the bulk of the NASTRAN code in the near 
future. 
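
To suggest why routines such as forward/backward substitution 
respond well to vectorization, the sketch below shows a 
column-oriented forward substitution. It is a generic textbook 
formulation written in Python for illustration and is not RPK's 
code; the function name forward_substitution and the dense 
list-of-rows storage are assumptions. The point is that the 
inner loop is a multiply/add over a column with no dependence 
between iterations, which is the kind of loop a vector processor 
can stream through its registers. 

```python
def forward_substitution(L, b):
    """Solve L x = b for a lower-triangular matrix L given as a list of rows.

    Column-oriented form: once x[j] is known, its contribution is removed
    from all remaining right-hand-side entries in one sweep.
    """
    n = len(b)
    x = list(b)  # work on a copy so the caller's b is left unchanged
    for j in range(n):
        x[j] = x[j] / L[j][j]
        # Independent multiply/add updates: each x[i] depends only on
        # x[j], so this loop has no carried dependence and can vectorize.
        for i in range(j + 1, n):
            x[i] -= L[i][j] * x[j]
    return x


if __name__ == "__main__":
    L = [[2, 0, 0],
         [1, 3, 0],
         [4, 5, 6]]
    b = [2, 7, 32]
    print(forward_substitution(L, b))
```

For the small system in the usage example, the exact solution is 
x = (1, 2, 3). 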


CONCLUDING REMARKS 

In developing the CRAY version of COSMIC/NASTRAN, RPK hopes 
to satisfy the needs of CRAY users who desire to use 
COSMIC/NASTRAN on the CRAY and may desire to have access to the 
machine-independent source code. RPK is fully committed to 
maintaining the CRAY version in such a manner as to be fully 
compatible and equal in capability with the latest 
COSMIC-maintained versions of NASTRAN. 


Figure 1. Generalized block diagram of a vector computer 

Figure 2. Vector processor of a CRAY computer 

Figure 3. Vector processing on the CRAY 